(54) System for transmission of embedded applications over a network



(57) A system and method for transmitting embed- 
ded applications over a network is disclosed, wherein a 
user of a computer-controlled network client, such as a 
remote control device used for controlling a network of 
computer-controlled home entertainment devices, or a 
Web browser running on a Web client, can request and 
receive compound documents that include embedded 
applications and/or data files that can only be processed 
(i.e., imaged or played) by handlers that are not resident
on the client. In addition to embedded documents, the 
compound documents that are transmitted over the network
can reference flat files (e.g., image, audio, or text
files), and other compound documents. Whenever a cli- 
ent receives a compound document, the client deter- 
mines whether it has access to all of the documents ref- 
erenced in the compound document and, if not, requests 
the documents to which it does not have local access. 
So that the multiple documents embedded in a com- 
pound document can be simultaneously output by the 
client to a multimodal output device, the requestor in- 
cludes a multi-tasking real-time kernel. This scheme al- 
lows a client user to download documents from a server 
that include embedded applications, which, when executed
on the client, allow the client to control the servers
using commands downloaded from the servers. 
Description

The present invention relates generally to networks 
of clients and servers, and particularly to networks 
where clients are allowed to control the operation of 
some of the servers. 

BACKGROUND OF THE INVENTION 

The present invention is an improvement over com- 
pound document transmission capabilities provided by 
current networks. 

For the purposes of this document, a "network" is 
understood to be a plurality of interconnected, compu- 
ter-controlled devices that are capable of cooperative 
interactions. In most networks (for example, the Inter- 
net), the networked devices are either clients (users of 
documents) or servers (providers of documents). In 
such a network architecture, a client is able to download 
documents and have services performed remotely by 
sending appropriate messages to the particular "server(s)"
that is (are) responsible for performing the service
or storing the desired documents. Of course, for the cli- 
ent to be able to do anything with downloaded docu- 
ments, those documents must be provided in a format 
that allows the client to display or execute the document. 

At the Internet's inception, this compatibility did not 
pose a great challenge. Many of the documents stored 
on servers were simple flat text (i.e., ASCII) files which, 
after being downloaded using a standard protocol such 
as TCP/IP, could be displayed on a client with a common 
text editor or viewer. Later, flat image files of various for- 
mats (TIFF, GIF, JPEG, etc.) became available on Inter- 
net servers, which required special graphics viewers to 
be available on the client. Over time, as the number of 
documents stored on the Internet and the variety of doc- 
ument formats grew, it became clear that there was a 
need for an Internet document transmission facility that 
would allow a user painlessly to view Internet docu- 
ments of various, even mixed formats (e.g., compound 
documents that incorporate ASCII text and differently- 
formatted graphics), and easily find and view other doc- 
uments related to the document just downloaded. This 
need was addressed by the World-Wide Web. 

The World-Wide Web ("WWW") links many of the
servers making up the Internet, each storing documents 
identified by unique universal resource locators (URLs). 
Many of the documents stored on Web servers are writ- 
ten in a standard document description language called 
HTML (hypertext markup language). Using HTML, a de- 
signer of Web documents can associate hypertext links 
or annotations with specific words or phrases in a doc- 
ument (these hypertext links identify the URLs of other 
Web documents or other parts of the same document 
providing information related to the words or phrases)
and specify visual aspects and the content of a Web 
page. 

A user accesses documents stored on the WWW
using a Web browser (a computer program designed to
display HTML documents and communicate with Web
servers) running on a Web client connected to the
Internet. Typically, this is done by the user entering the URL
of a desired document or selecting a hypertext link
(displayed by the Web browser as a highlighted word or
phrase) within a document being viewed with the Web
browser. The Web browser then issues an HTTP
(hypertext transfer protocol) request for the requested
document to the Web server identified by the requested
document's URL. In response, the designated Web server
returns the requested document to the Web browser,
also using HTTP, and the Web browser displays the
document locally, including any text and images
associated with the document. The document delivery
capabilities and ease of use features provided by the Web
and Web browsers are invaluable. However, HTML and
the Web would be even more useful if they provided
support for embedded applications in HTML documents.

Currently, there is no way to embed executable code
fragments, or links to executable code fragments, within
an HTML Web page so that the code fragments are
executable on a Web client. Such a capability would be
very useful as the embedded code fragments could
range from simulations, sound clips or video clips
interactively running within a Web page, to communications
routines or application programs that could be triggered
by a user of the Web browser from the embedding
compound document. Moreover, these capabilities could be
used to allow a user to download data from a Web server
without being concerned that the appropriate type of
handler is resident on their Web client. That is, a Web
document embedding a particular type of data could
also include an address to the appropriate handler for the
data so that, if needed, the user's Web browser could
find, download and execute the needed handler on the
data.

Aside from the use of embedded applications in
Web pages, other uses for embedded documents within
networks of computer controlled devices are apparent.
For example, in home entertainment systems, where
there are multiple components, each with a unique
command set (sometimes very complex) and unique user
interfaces (typically unintuitive), documents with
embedded application capabilities could be adapted to
provide a highly adaptable universal remote control. Such
a system would allow a user to interact with the various
components using a visually-oriented remote control
device that displays user control options (about which
the remote is ignorant) and other information, such as
context sensitive help messages and graphics that are
downloaded from the component being controlled. That is,
such a universal remote would be able to control devices
about whose functionality it has no pre-programmed
knowledge.

Thus, there is a need for a system and method for
embedding applications, or code fragments, in documents
transmitted over a network between computer-controlled
network devices. This system and method
should allow the client to request and receive from a
server an executable program that the client can then
execute in the context of the document in which the
application was embedded. The executable programs
should be of at least four types: (1) output code that,
when executed, produces a visual or audible manifestation
(e.g., graphical or sound simulations); (2) meta-knowledge
code that can advise a user regarding legal
interactions with the document in which the code fragment
was embedded; (3) contextual code that can
sense and indicate the processing context of the compound
document in which the code fragment was embedded;
and (4) handlers for embedded data.


SUMMARY OF THE INVENTION 

In summary, the present invention is a system and 
method for transmission of embedded documents over 
a computer network that meets the needs set out above.

More particularly, the present invention is a system
for transmitting embedded documents over a network
that includes at least one server and at least one client,
each server and client including a computer controller
and a memory and having a unique network ID. The system
includes a requestor that is executable on a client's
computer controller, which is configured to control
messages issued by the client on the network. One of these
messages is a document request message that instructs
a particular one of the servers to transmit to the
client a particular compound document, where a compound
document is a document that references or includes
a plurality of embedded documents that can be
executable code fragments, flat documents or other
compound documents.

Another aspect of the present invention is a provider 
that is executable on a server's computer controller. The 
provider is responsive to messages directed to the serv- 
er. For example, the provider is configured to respond 
to the document request message by causing the server
to transmit to the client the particular requested com- 
pound document. Upon receiving a particular com- 
pound document, the requestor is configured to retrieve 
at least a subset of the embedded documents referenced
by the particular compound document that are
not stored in the client's memory and form an assembled 
compound document including the flat documents and 
the executable code fragments. 

The present invention is also a method for transmitting
embedded documents over a network that includes
at least one server and at least one client coupled to the 
network, each server and client including a computer 
controller and a memory and having a unique network 
ID. As the first step in the method, a client issues a document
request message on the network to a particular
server, where the document request message desig- 
nates a particular compound document to be returned 
to the client by the particular server. Next, in response
to the document request message, the particular server
returns the particular compound document to the client. 
This compound document includes references to a plu- 
rality of embedded documents, each of which has a type 
selected from executable code fragments, flat docu- 
ments and other compound documents. Upon receiving 
the compound document from the server, the client is- 
sues a plurality of document request messages to re- 
trieve via the network any flat documents and executa- 
ble code fragments referenced by the compound docu- 
ment that are not stored in the client's memory. Finally, 
the client forms an assembled compound document in- 
cluding the retrieved flat documents and executable 
code fragments. 
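
The method steps above can be illustrated with a minimal Python sketch. It is not taken from the specification: the `Document` class, the `assemble` function, the dictionary standing in for the server, and all document names are hypothetical, and the choice to expand nested compound documents is an assumption (the text requires only that flat documents and code fragments not held locally be retrieved).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    kind: str                                  # "compound", "flat" or "code"
    refs: list = field(default_factory=list)   # names of embedded documents


def assemble(root, server, local_store):
    """Return (assembled parts, request messages issued) for compound document `root`."""
    requests = []          # document request messages sent over the network
    assembled = []         # flat documents and code fragments, in reference order
    pending, seen = [root], set()
    while pending:
        name = pending.pop(0)
        if name in seen:
            continue
        seen.add(name)
        if name not in local_store:            # request only what is not held locally
            requests.append(name)
            local_store[name] = server[name]
        doc = local_store[name]
        if doc.kind == "compound":
            pending.extend(doc.refs)           # assumption: nested documents are expanded
        else:
            assembled.append(name)
    return assembled, requests


# Hypothetical server contents, for illustration only.
SERVER = {
    "control_TV": Document("control_TV", "compound", ["logo.gif", "gif_handler", "help"]),
    "logo.gif": Document("logo.gif", "flat"),
    "gif_handler": Document("gif_handler", "code"),
    "help": Document("help", "compound", ["help.txt", "gif_handler"]),
    "help.txt": Document("help.txt", "flat"),
}
```

In this sketch a client that already holds `gif_handler` in memory issues request messages only for the remaining documents, which mirrors the "not stored in the client's memory" condition of the method.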

BRIEF DESCRIPTION OF THE DRAWINGS 

Examples of the invention will now be described in 
conjunction with the drawings, in which: 

Figure 1 is a block diagram of a preferred embodi- 
ment. 

Figure 2 is a more detailed block diagram of the pre- 
ferred embodiment of Figure 1. 

Figure 3 is a data flow diagram showing how the 
preferred embodiment responds to a user's selection of 
device control and initialization options from a displayed 
default document. 

Figure 4 is a data flow diagram showing how the 
preferred embodiment transforms a compound docu- 
ment into an assembled document then into output doc- 
uments with embedded applications. 

Figure 5 is a flow chart showing the steps of the pre- 
ferred embodiment. 

Figure 6 is a depiction of an alternative embodiment 
where the clients, servers and network of Figure 1 are 
Web clients, Web servers and the Internet, respectively. 

DESCRIPTION OF THE PREFERRED EMBODIMENT 

Referring to Figure 1, there is shown a block dia- 
gram of the preferred embodiment, in which a multiplic- 
ity of device enclosures 110a-c (hereinafter referred to 
as "enclosures") are networked with a remote control 
device 120 using RF transmissions received and trans- 
mitted via a radio frequency (RF) receiver/transmitter 
118a-c, 128 provided in each respective enclosure 
110a-c and the remote control device 120. In the pre- 
ferred embodiment, the remote control device 120 acts 
as the network client (i.e., a user of documents) and the
enclosures 110a-c the network servers (i.e., providers 
of documents), each of which has a unique network ad- 
dress. 

Each of the enclosures corresponds to a typical 
piece of equipment in a home entertainment system. For 
example, the enclosures 110a, 110b correspond to a TV 
set and a VCR unit, respectively. Other types of enclosure
are represented in Figure 1 by the generic enclosure 110c.
In addition to the RF receiver/transmitter 118,
each of the enclosures 110a-c includes a computer controller
112a-c, a characteristic device 114a-c and memory
116a-c (some of which is nonvolatile), all of which
are coupled to the controller 112. 

Each characteristic device performs the function
that people identify with the enclosure. For example,
where the enclosures 110a, 110b are a TV enclosure
and a VCR enclosure, the respective characteristic devices
are a TV tuner/display 114a and a VCR tuner/recorder/player
114b. Typically, each of these devices can
be manually controlled via a front panel provided in the
enclosure 110. Alternatively, in the remote control mode
addressed by the preferred embodiment, each characteristic
device 114 is controlled by the device's local controller
112 based on appropriate control messages issued
over the RF network by the remote control device
120. One of the keys to the present invention is that
these control messages and the knowledge of how to
control a particular device are not prestored in the remote
control device 120. Rather, the remote control device
120 dynamically downloads whatever information
it needs, including executable code fragments and flat
files, from the enclosure 110 to be controlled. For each
enclosure 110a-c, this control information and an operating
system for the controller 112a-c are stored in the
non-volatile portion of the memory 116a-c.

The remote control device 120 (hereinafter referred
to as the "remote" or "remote control") is an adaptable,
universal remote that can control any of the enclosures
110a-c and guide a user through the procedures for
controlling the enclosures 110a-c. Principal elements of the
remote control device 120 include the RF transmitter/receiver
128, a computer controller 122, a memory 126
(some of which is non-volatile), a multimode output device
130, a user input device 132 and a name server
134, which is a piece of software that runs in the controller
122, all of which are interconnected. The computer
controller 122 coordinates all operations of the elements
of the remote control device 120 in conjunction
with the memory 126, which stores an operating system
for the controller 122, initialization programs and files,
and information downloaded from the enclosures 110a-c.
A user interacts with the remote 120 via the user input
device 132 and the output device 130, which can display
still or video images and output audio. The user input
device 132 can be physically distinct from the output device
130 (e.g., a keypad), or can be a touch sensitive
matrix overlaying the display and video sections of the
output device 130, which allows the user to interact with
the images on the output device. Based on the user inputs,
the controller 122 issues different control messages
to the enclosures, these messages and the process
of determining which message to issue having been previously
downloaded by the remote control device 120
from the enclosure being controlled. Additional details
of the enclosures 110, remote control 120, and information
exchanged between them, are now discussed in reference
to Figure 2.



Referring to Figure 2, there is shown a detailed 
block diagram of the preferred embodiment in which the 
remote control device 120 is interacting with a single device
enclosure 110. This figure shows additional details
of the output device 130 and the memories 126, 116, 
which are now described. 

The output device 130 includes two distinct sections:
a display section 130a, which can display video
or still images, and an audio section 130b, which can
play sound clips. These two different output modalities
are provided by the output device 130 so that all features
of the output document 170 provided by the controller
122 can be fully realized.

The memory 126 of the remote control 120 provides
both non-volatile and volatile storage capabilities, the
non-volatile storage capabilities being provided to store
program and data items that the remote control device
120 requires at initialization. These items, held in
non-volatile storage, include an initialization program 126.1,
default documents 126.2, name server registry 126.3 (i.e.,
network addresses of the devices 110a-c) and an operating
system 126.5, which incorporates a real-time operating
system kernel 126.4. The operating system
126.5 is a program that executes in the controller 122
whenever the remote control device 120 is up and running.
Its duties include handling the controller 122's interactions
with external devices, such as the user input
device 132, display 130 and the RF transmitter/receiver
128, and managing programs being executed by the
controller 122 (e.g., loading programs into the memory
126 for execution and handling program requests for access
to external devices). The operating system 126.5
can operate in at least one of two modes. When the operating
system 126.5 is functioning in a single-tasking
mode, software jobs are performed to completion one
at a time, with no consideration given to external time
constraints. The operating system 126.5 can also function
in a multi-tasking, real-time mode, in which multiple
jobs are performed simultaneously and under externally-imposed
time constraints. These real-time capabilities
are provided by the real-time kernel 126.4, which
determines how the operating system allocates
processing time in the controller 122 among multiple
jobs so that each of the jobs runs in real-time. As will be
explained later, these real-time capabilities are desirable
in the preferred embodiment, where multiple code
fragments, some of which provide user interactability,
could be running simultaneously on the controller 122.
We will now describe how the controller 122 is initialized,
which process is driven by the initialization program
126.1.

Referring to Figure 3, there is shown a data flow
diagram depicting data transformation/transmission actions
performed by the controllers 112 and 122 in response
to some typical user interactions with the initialization
program 126.1. This diagram shows data objects
as rectangles and components that act on the data objects
as diamonds. Some of the diamonds appear multiple
times, representing components, such as the remote
controller 122, that perform multiple actions. The
initialization program 126.1, which is executed by the
operating system 126.5 whenever the remote 120 is
powered up, performs some housekeeping routines before
handing off control of the remote 120 to a user.
Among these housekeeping duties, the initialization program
126.1 causes the controller 122 to output the default
document(s) 126.2 to the multimodal output device
130, making available to the user a set of selectable remote
options. This default document 126.2 defines a set
of user-selectable enclosure control options 210 (linked
to the displayed control options 210') and remote setup
options 212 (linked to the displayed setup options 212')
that can be executed by the controller 122 without first
issuing messages on the network. In the preferred embodiment,
each enclosure control option 210 includes a
first field specifying a local name of a device 110 to be
controlled, which local name is correlated with the name
of a device registered in the name server registry 126.3,
and a second field specifying the name of a message to
be issued to the device named in the first field when the
control option is selected by a user from the display 130.
For example, referring to Figure 3, the two fields of the
default document entry 210a are:

device = TV_enclosure; and 
message = req_doc(control_TV). 

This particular entry 210a is linked to a user-selectable
enclosure control option displayed by the controller 122
as "control TV" 210a'.

When the user selects one of the control options
210' from the display 130, the controller 122 determines
the network ID of the appropriate device by consulting
the name server registry 126.3 and then issues the associated
message 150 to the specified device. For example,
as shown in Figure 3, when the user selects the
displayed option 210a' ("control_TV"), the controller 122
determines that the TV enclosure has the network address
110a and then issues to that address a document
request message 150 for the "control_TV" document.
Upon receiving a document request message, a controller
112, under control of its operating system 116.1,
processes the message and responds accordingly. This
response could be to implement an action, such as tuning
in a TV channel, or to return a compound document
140 to the remote control 120. For example, in the situation
of Figure 3, the TV controller 112a simply returns
to the controller 122 the contents of the compound document
"control_TV" 140a, which specifies a basic set of
operations for controlling the TV 114a that the controller
122 displays on the multimode display 130.
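
The lookup-then-issue sequence just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary representation of the registry and of the default document entry, the `issue` function, and the `tv_server` stand-in for the TV controller 112a are all hypothetical and do not appear in the specification.

```python
# Hypothetical name server registry: local device name -> network address.
REGISTRY = {"TV_enclosure": "110a", "VCR_enclosure": "110b"}

# Default document entry 210a, with the two fields described in the text.
ENTRY_210A = {"device": "TV_enclosure", "message": ("req_doc", "control_TV")}


def issue(entry, registry, send):
    """Resolve the device name to a network address, then send the entry's message."""
    address = registry[entry["device"]]
    return send(address, entry["message"])


def tv_server(address, message):
    """Stand-in for the TV controller: answers document requests sent to 110a."""
    documents = {"control_TV": "<compound document control_TV>"}  # placeholder content
    operation, argument = message
    if address != "110a":
        raise LookupError(f"no TV enclosure at {address}")
    if operation == "req_doc":
        return documents[argument]
    raise ValueError(f"unsupported message: {operation}")
```

Keeping the registry separate from the default document entries means an entry never stores a network address directly, so a device can change address without the entry being rewritten.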

As stated above, the default document 126.2 also
provides a list of system configuration options 212
(linked to displayed system options 212') that can be selected
by a user from the user input device 132. Note
that these options do not include an option that allows
a user to "install new network devices." This is because
the preferred embodiment performs network installations
automatically. In this automatic installation process,
as soon as the user plugs in a new enclosure 110,
the new enclosure's controller 112 "wakes up" and begins
operating under control of its operating system
116.1. Recognizing its "new" status, the operating system
116.1 asks its associated name server 119 to register
the enclosure. In response, that name server 119
broadcasts on the network the name and address of the
new enclosure to the other enclosures connected to the
network. Upon receiving this message, the name servers
running on the controllers 112 and 122 in the other
enclosures register the new enclosure by updating their
name server registries with the new device's network
address and name. For example, if a user adds a laser
disc player 110c, a tuner 110d and a CD player 110e to
the network, the name server running on the controller
122 would add laser disc, tuner and CD player entries
to the name server registry 126.3 as shown in Figure 3.
After registering the new enclosures 110c-e, the controller
122 downloads from each the name of an initial message
(such as the control_TV message), references to
which it adds to the default document 126.2 as the control
options 210c'-e'.
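
A minimal Python sketch of this automatic registration follows. The `Node`, `Remote` and `Network` classes, the `on_announce` callback and the message name `control_LD` are hypothetical names introduced for illustration; the broadcast is modeled as a simple loop over the nodes already on the network, which ignores the RF transport entirely.

```python
class Node:
    """A network node holding a name server registry (name -> network address)."""

    def __init__(self, name, address, initial_message=None):
        self.name = name
        self.address = address
        self.initial_message = initial_message   # e.g. "control_TV"
        self.registry = {}

    def on_announce(self, new_node):
        # Register the newly announced enclosure.
        self.registry[new_node.name] = new_node.address


class Remote(Node):
    """The remote also extends its default document with a new control option."""

    def __init__(self, name, address):
        super().__init__(name, address)
        self.default_document = []

    def on_announce(self, new_node):
        super().on_announce(new_node)
        self.default_document.append(
            {"device": new_node.name,
             "message": f"req_doc({new_node.initial_message})"}
        )


class Network:
    def __init__(self):
        self.nodes = []

    def plug_in(self, node):
        # The new node's name and address are broadcast to every existing node.
        for other in self.nodes:
            other.on_announce(node)
        self.nodes.append(node)
```

The sketch covers only the direction described in this paragraph, in which existing nodes learn about a newcomer; how a newcomer learns about nodes that were already present is not modeled here.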

If the remote 120 loses its memory (including the
name server registry 126.3) or if a new remote 120 is
added to the network, the blank or new remote 120 is
programmed to broadcast a request message asking
the enclosures 110 to register themselves if that would
be appropriate (some of the enclosures might not be
compatible with the remote). In response, the enclosures
would register themselves as described above.
Collision avoidance procedures, which are well known
in the art of networks, ensure that each of the registering
enclosures is allowed to access the network.

In addition to storing addresses of all enclosures on
the network, each name server 119, 134 lists the names
and internal addresses of all of the documents (including
compound documents, flat files and code fragments)
that are stored in its memory 116, 126 and possibly other
relevant documents stored on other network nodes.
This information allows the controller 122 to easily locate
documents (for sending or displaying). Additional
information on these and other aspects of name servers
is provided in Sanjay Radia, Michael N. Nelson, and
Michael L. Powell, The Spring Name Service, from: A
Spring Collection, A Collection of Papers on the Spring
Distributed Object-Oriented Operating System, SunSoft,
September 1994, which is incorporated herein by
reference.

Referring again to Figure 2, the memory 116 of a
device 110 includes compound documents 140, flat files
142, code fragments 144 and an operating system 116.1
that controls operations of the controller 112 whenever
the device 110 is powered up. These objects 140, 142
and 144 encapsulate all of the information required by
the remote 120 to control the operation of the device 110
on which the objects are stored. In the preferred embodiment,
each compound document 140 contains references
to flat files (FF_refs), code fragments (CF_refs)
and other compound documents (CD_refs) that are
meant to be displayed/executed by the controller 122 in
a coordinated fashion, typically in a single window on
the display 130. For example, a root compound document,
such as the control_TV document mentioned
above, contains references to objects 140, 142, 144
that, when displayed together by the controller 122,
present a top-level listing of the TV's basic control options.
In this approach, the flat file references (FF_refs)
refer to text, graphics, or audio files to be output by the
controller 122 and the code fragment references
(CF_refs) refer to embedded applications to be executed
by the controller 122 while displaying the compound
document 140 or to handlers for the flat files 142 referenced
in the compound document 140.

In the preferred embodiment, each CD_ref, FF_ref 
or CF_ref includes two fields: 

- "addr" (a document address); and 

- "posn" (a display position - not used for audio files). 

The "addr" field specifies the unique address of the ref- 
erenced object in the manner of a WWW hyperlink, ex- 
cept that in the preferred embodiment the object being 
referenced is generally stored on the same network 
node as the referencing document. The "posn" field in- 
dicates to the controller 122 where the object is to be 
positioned when the controller 122 displays the com- 
pound document 140. 
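
As an illustration only, the two-field reference could be modeled as a small record type. The `Ref` class, the `placements` helper, the coordinate-pair representation of "posn" and the example addresses are assumptions made for this sketch; the specification does not define a concrete encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Ref:
    kind: str                                  # "CD_ref", "FF_ref" or "CF_ref"
    addr: str                                  # unique address of the referenced object
    posn: Optional[Tuple[int, int]] = None     # display position; None for audio files


def placements(refs):
    """Return (addr, posn) for every reference that occupies display space."""
    return [(ref.addr, ref.posn) for ref in refs if ref.posn is not None]


# Hypothetical references held by a compound document.
EXAMPLE_REFS = [
    Ref("FF_ref", "tv/logo.jpg", (10, 10)),
    Ref("FF_ref", "tv/chime.wav"),             # audio: no position
    Ref("CF_ref", "tv/simulation", (10, 120)),
]
```

Making "posn" optional reflects the note above that the field is not used for audio files.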

A flat file 142 encapsulates two types of information, 
content and attributes. Content is the data that corre- 
sponds to what the flat file expresses, such as text, 
sound or image data. The attributes define meta-knowl- 
edge about the content, including how the data is for- 
matted (which implies the type of its handler) and should 
appear/sound. For example, typical flat file attributes 
might indicate that the content is ASCII-, HTML- or Post- 
script-formatted text, MIDI or WAV-formatted sound, or 
GIF- or JPEG-formatted graphics. 

This attribute information is necessary as it indi- 
cates to the controller 122 how the content should be 
processed before being output to the multimode output 
device 130.

A code fragment 144 also encapsulates two types 
of information, binary code for an embedded application 
that is executable on the controller 122 and attributes 
related to the binary code. For example, assuming that 
the code fragment 144a is a program that visually sim- 
ulates some device control process, the attributes for 
that code fragment might define the attributes of the win- 
dow in the referencing compound document in which the 
fragment 144a should run. In the preferred embodiment,
embedded applications might include a program that
runs an animated sequence showing the user how to
select a complex device function using options displayed
on the display 130, a program that determines a
sequence of document request messages to be issued
when a user selects a particular option, or even an expert
system that can answer user questions about programming
the devices 110.

However, the most typical embedded applications
144 transferred from an enclosure 110 to the remote 120
are handlers that are compatible with flat files 142 transferred
from the same enclosure 110 to the remote 120,
generally in the context of a common compound document
140. This is because, to promote true universality
(the idea that the remote 120 is able to display/execute
any information provided by any enclosure 110), the preferred
embodiment assumes that the controller 122
knows nothing about displaying any type of flat file provided
by the different enclosures 110. Operating under
this assumption, for a referenced flat file 142 to be displayed
on the remote display 130, the controller 122
must have access to a handler code fragment that is
compatible with the referenced flat file 142. Once loaded
by the controller 122, this handler will display/process
the referenced flat file 142 and manage the real estate
on the display 130 where the flat file is to be displayed.
For example, a button-type flat file might include a
digitized JPEG image of the cover of the novel "War
and Peace" and the network address of a flat file that
contains the entire text of the novel. This flat file's associated
handler would be required to display the cover in
an appropriate size and at an appropriate screen position
on the display 130. This flat file's handler might also
contain a method that is triggered whenever a user selects
the region of the display 130 on which the cover is
displayed, causing the controller 122 to issue a network
message requesting the text of "War and Peace" using
the address contained in the flat file.

In the preferred embodiment, a handler's executable
code is not packaged along with the flat file(s) it is
meant to handle, but is often referenced via a CF_ref in
the compound document referencing its associated flat
files. Alternatively, no CF_ref to the handler is required
when a handler's identity can unambiguously be implied
by the attributes of the referenced flat file alone (e.g., if
the file is MIDI audio, to play the file, the controller 122
will simply request a MIDI device driver on the network).
One advantage of this approach (where data and handler
executable code are separated) is that the remote
control 120 retains backward compatibility with data files
as they are typically constituted (i.e., data files, such as
.WAV files, are generally not packaged with their handlers),
while being able to handle seamlessly all manner
of unknown data files. This arrangement also avoids the
inefficiency of requiring the remote 120 to unnecessarily
download the same handler multiple times.

In the preferred embodiment, even the simplest
type of flat file (e.g., scrolling ASCII text) requires a corresponding
handler code fragment running on the remote
120. Assuming that none of the corresponding
handlers are locally available to the controller 122, in
the case of a complex compound document that references
many flat files of different formats this could
result in much downloading of handler code fragments.
The preferred embodiment significantly reduces this risk
by pre-storing in the memory 126 handler code fragments
for common flat file data formats, such as .WAV
and MIDI audio, ASCII text and JPEG graphics.

Referring to Figure 4, there is shown a data flow 
diagram illustrating how a particular compound docu- 
ment 140a requested by the controller is processed be- 
fore being output to the multimodal display 130. As in 
Figure 3, this diagram shows data objects as rectangles 
and components that act on the data objects as dia- 
monds. Some of the diamonds appear multiple times, 
representing components, such as the remote controller 
122, that perform multiple actions on the data flow. 

Assume that, as discussed previously in reference
to Figure 3, a user has selected an option, such as
"control_TV", that triggered the controller 122 to request
the "control_TV" document 140a from the TV enclosure
110a, and that, in response, the TV device has returned
the compound document 140a depicted in Figure 4. This
document 140a includes references 140.1 to two flat
files, 142a and 142b, the first of which is to be positioned at
point P1 on a window corresponding to the displayed
compound document (note that the document 142b has no
associated position as it is an audio file). The document
140a also includes references 140.2 to the code fragments
144a and 144b, with display positions
of P3 and P4, respectively, and to the code fragments
144c, 144d, which are handlers for the flat files 142a,
142b, respectively. Finally, the document 140a includes
a reference 140.3 to the compound document 140b,
which is to be positioned at position P5 on the display
130. The compound document 140b includes references
to a flat file 142c and the handler 144c for that file, which
is identical to a handler already referenced in the compound
document 140a.

From the compound document 140a, the controller
122 builds an assembled document 160, which is stored
by the controller 122 in a volatile section of the memory
126 as a stored assembled document 126.7 (FIG. 2).
Generally, an assembled document 160 is an internal
representation of a compound document 140 in which
all of the references have been recursively resolved, either
to a code fragment or to a flat file/handler pair. For
example, in the case of the compound document 140a
from Figure 4, the assembled document 160 includes
assembled versions 142a', 142b', 142c' of the flat files
142a, 142b, 142c; and assembled versions 144a'-d' of
the code fragments 144a-d. Note that an assembled
version of an object combines information from the actual
object (e.g., in the case of a flat file, the contents
and attributes) and the object's output position, where
relevant. This information allows the controller 122 (using
the appropriate handlers) to convert the assembled
compound document 160 to a formatted output document
170 that can be directly output on the multimodal
output device 130 in the manner envisioned by the designers
of the particular device 110 being controlled.
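
The recursive resolution of references into an assembled document can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the controller's actual implementation: the Ref and Compound classes and the assemble function are hypothetical, leaf objects are stood in for by plain strings, a nested object with no position of its own is assumed to inherit the position of its enclosing document, and no guard against cyclic references is included.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ref:
    target: str                     # identifier of the referenced object
    position: Optional[str] = None  # output position, where relevant


@dataclass
class Compound:
    refs: List[Ref] = field(default_factory=list)


def assemble(doc_id: str, store: Dict[str, object],
             position: Optional[str] = None,
             out: Optional[Dict[str, Optional[str]]] = None) -> Dict[str, Optional[str]]:
    """Flatten a compound document into leaf objects with output positions."""
    if out is None:
        out = {}
    obj = store[doc_id]
    if isinstance(obj, Compound):
        # Recurse into every reference, passing down the enclosing position.
        for ref in obj.refs:
            assemble(ref.target, store, ref.position or position, out)
    elif doc_id not in out:
        # Leaf (flat file or code fragment): record it once only.
        out[doc_id] = position
    return out
```

Applied to a store modelled on Figure 4, where 140a references 140b at P5 and 140b references the flat file 142c, the result maps 142c to P5 and lists the shared handler 144c a single time.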

For example, in Figure 4, the output document
170a, corresponding to an image that can be output on
the display section 130a, defines windows at positions
P1, P3, P4 and P5 where the output flat files 142a",
142c" and executing code fragments 144a", 144b" are
simultaneously output by the controller 122. Each of
these windows is under the control of a code fragment
144a"-c". For example, the displayed flat files 142a",
142c" are both displayed under the control of the executable
code fragment/text handler 144c". The executing
code fragments 144a"-d" can be one of four types:
(1) output code that, when executed, produces a visual
or audible manifestation (e.g., graphical or sound simulations),
(2) meta-knowledge code that can advise a user
regarding legal interactions with the document in
which the code fragment was embedded, (3) contextual
code that can sense and indicate the processing context
of the compound document in which the code fragment
was embedded, and (4) handlers for embedded data
(described above). One example of the first type of executable
code fragment is an animation that shows a
user how to connect their cable to the back of the TV
enclosure 110a. An example of the second type of code
fragment is a smart help application that can tell the user
how to select the functions displayed in the flat file
142a". Finally, an example of the third type of code fragment
is a smart icon, or cartoon, that monitors controller
traffic to and from the remote control 120 and then signals
a user, for example, by waving an animated hand,
that transmissions are underway.

Because multiple code fragments and flat files
might need to be output to the output device 130 simultaneously,
the real-time kernel 126.5 is required to
achieve perceptibly real-time output and to avoid user
frustration. Finally, because the flat file 142b" is a sound
file (e.g., a .WAV file), the controller 122 processes it with
an appropriate handler 144d" and outputs the resulting
real-time sound 170b to the audio section 130b of the
multimodal output device.

The method by which the controller builds an as- 
sembled document is illustrated in Figure 5, which is a 
flow chart of the method of the preferred embodiment. 

As mentioned above in reference to Figure 3, the
first step in the method of the preferred embodiment is
to display the default document(s) 126.2, from which the
user can select a number of basic options (250). Once
a user has selected one of these options, the controller
122 fetches from the network the appropriate document
(252). For example, as mentioned above, when the user
selects the control_TV option, the controller 122 issues
a document request message for the control_TV document
to the TV 110a. Upon receiving the compound document
140, the controller 122 determines whether it has
local access to all of the documents that are needed to
process the first compound document (254). If not
(254-NO), the controller 122 pulls in from the network
all of the needed documents that are not locally available
(256). For example, in the case illustrated in Figure
4, prior to displaying the control_TV document, the controller
122 downloads from the TV enclosure 110a the
referenced flat files 142a-c and the code fragments
144a-d. Each of these objects is temporarily stored in
the memory 126 as the temporarily stored downloaded
objects 126.6 (258).
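
Steps 254 through 258 amount to a local-availability check followed by a download of whatever is missing. The following Python sketch illustrates that idea only; the function name pull_missing, the dictionary used as the local store and the fetch callback standing in for a document request message are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable


def pull_missing(referenced: Iterable[str],
                 local: Dict[str, bytes],
                 fetch: Callable[[str], bytes]) -> Dict[str, bytes]:
    """Download every referenced object that is not locally available."""
    downloaded: Dict[str, bytes] = {}
    for name in referenced:
        # Step 254: skip objects already held locally or fetched earlier in this pass.
        if name in local or name in downloaded:
            continue
        # Step 256: pull the missing object in from the network.
        downloaded[name] = fetch(name)
    # Step 258: the caller keeps the result as temporarily stored objects.
    return downloaded
```

Because an object is skipped once it is in either store, a handler referenced by several documents is requested at most once.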

In the preferred embodiment, the required docu- 
ments discussed in reference to Figure 5 could encom- 
pass two different classes of information. First, a re- 
quired document might be an object (flat file, code frag- 
ment or other compound document) that is explicitly ref- 
erenced in a downloaded compound document. These 
explicitly referenced objects are included in the assem- 
bled document 160. This inclusion process is recursive 
and can proceed until a leaf-level object (i.e., one with
no referenced compound document objects) is finally
retrieved by the controller 122. 

Second, a required document could be an unrefer- 
enced handler for the particular type of file just retrieved, 
where the identity of a required handler is determined 
by an embedded document's attribute field. For exam- 
ple, if the flat file 142b were a .WAV audio file, and the 
controller 122 did not currently have local access to a
.WAV audio file handler and one was not referenced in
the compound document 140a, the controller 122 would
issue a request to all of the networked devices 110 for
an appropriate audio handler. Upon receiving the handler,
the controller 122 would then install the handler and
begin executing the handler on the contents of the re- 
lated flat file 142. Alternatively, the controller 122 can 
temporarily store the handler in the memory space 
126.7 with the related flat file and play the flat file when 
the rest of the assembled document 160 is ready to be 
output. 
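
The request for an unreferenced handler can be sketched as a query over the networked devices. In this illustrative Python sketch the broadcast is approximated by asking each device in turn and accepting the first match; the request_handler function, the device identifiers and the dictionary describing what each device offers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def request_handler(file_type: str,
                    devices: Dict[str, Dict[str, str]]) -> Optional[Tuple[str, str]]:
    """Ask each device for a handler matching file_type.

    devices maps a device id to the handlers it offers, keyed by file type.
    Returns (device id, handler) for the first match, or None if no device has one.
    """
    for device_id, offered in devices.items():
        if file_type in offered:
            return device_id, offered[file_type]
    return None
```

A None result corresponds to the case in which no networked device can supply a handler, a case the description above does not address.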

After all of the embedded compound documents
have been recursively retrieved (i.e., after building the
assembled document 160), the controller 122 generates
the complete output document 170, which it outputs to
the multimodal output device 130 (260). Based on the
types of the files, different parts of the output document 170
are directed to different sections of the output device.
For example, in the case where the flat file is an audio
file, its corresponding output document 170b is directed
to the audio section 130b of the output device 130. The
other embedded documents of the document 140a, all
of which are graphically oriented, are included in an output
document 170a that is output to the display section
130a of the output device 130. In the case where some
of the flat files 142 and/or executable code fragments
144 involve the generation of video data, those objects
are also output to the display section 130a. Alternatively,
the controller 122 can output to the output device completed
parts of the assembled document 160 as soon
as they are ready.
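
The type-based routing of output can be sketched as follows. This Python sketch is illustrative only; the route_output function, the section names and the set of formats treated as audio are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

AUDIO_TYPES = {"WAV", "MIDI"}  # assumed set of audio formats


def route_output(objects: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split (name, file_type) pairs between the display and audio sections."""
    sections: Dict[str, List[str]] = {"display": [], "audio": []}
    for name, file_type in objects:
        # Audio files go to the audio section; everything else is displayed.
        section = "audio" if file_type in AUDIO_TYPES else "display"
        sections[section].append(name)
    return sections
```

With the flat files of Figure 4, the text files 142a and 142c are routed to the display section and the .WAV file 142b to the audio section.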

Once the completed assembled document has
been output to the output device 130, the controller 122
provides for user interaction whereby the user can select
various options (260) from the user input device
132. If the user selects an option requiring the controller
to fetch a document over the network, the aforementioned
steps of the method are all repeated. Of course,
in some situations, such as illustrated in Figure 3, the
user can select options, such as "set time" 212a', "set
display params" 212b' or "set audio params" 212c', that
are local to the remote control device 120.

ALTERNATIVE EMBODIMENTS 

The preferred embodiment is directed to networked,
computer-controlled home entertainment
equipment and a compatible, computer-controlled remote
control device that can be used to operate the various
components based on downloaded compound documents.
However, the system and method of the preferred
embodiment are also applicable to networks of
computers.

For example, in one alternative embodiment, illustrated
in Figure 6, the device enclosures 110 are exchanged
for Web servers 310 and the remote control
120 for a Web client 320. A Web browser 322 running
on the Web client performs the various compound document
file retrieval, processing and output operations
ascribed above to the controller 122. Some of the controller's
actions, such as downloading uniquely specified
documents as a result of a user selecting a reference,
are already performed by many Web browsers. However,
unlike the alternative embodiment, no Web browsers
are configured to download and then run (optionally, in real
time) executable code fragments that are referenced or
embedded in Web documents. These capabilities are
provided by the alternative embodiment in three different
contexts.

First, a reference to a code fragment can be em- 
bedded in an HTML (compound) document so that the 
reference is displayed by the Web browser as a hyper- 
link, the selection of which causes the browser to down- 
load and run the code fragment. Second, a reference to 
a code fragment can be embedded in an HTML docu- 
ment so that the code fragment is automatically re- 
trieved and run by the Web browser upon receiving the 
Web page that includes the code fragment. This type of 
included code fragment is useful for providing dynamic 
knowledge that is always associated with a particular 
page no matter what browser is being used. For exam- 
ple, a Web page that makes use of this aspect of the 
present invention might provide an embedded applica- 
tion that selects the best (i.e., least busy) server for a 
referenced document from among several alternatives 
based on known (to the embedded program) network 
activity patterns and the time of day. Finally, as in the 
situation of the audio file mentioned above, a Web 
browser operating in accordance with the alternative 
embodiment can download appropriate handlers based 
on the attributes of a previously received data file or 
when the handler's URL is specified in the same HTML-like
document that references a compatible flat file.
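
The three contexts can be illustrated with a small parser for an HTML-like page. This is a sketch under stated assumptions: the description does not specify any markup, so the rel="codefragment" attribute, the codefragment and flatfile tags and their attributes are hypothetical, as are the class and function names. Only the case of an explicitly given handler URL is shown for the third context. The sketch uses Python's standard html.parser module.

```python
from html.parser import HTMLParser


class CodeRefParser(HTMLParser):
    """Collects code-fragment references from an HTML-like page (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.on_select = []  # fragments run when the user selects a hyperlink
        self.auto_run = []   # fragments run as soon as the page is received
        self.handlers = []   # handler URLs named alongside a flat file

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("rel") == "codefragment":
            self.on_select.append(attributes.get("href"))
        elif tag == "codefragment":
            self.auto_run.append(attributes.get("src"))
        elif tag == "flatfile" and "handler" in attributes:
            self.handlers.append(attributes["handler"])


def find_code_refs(page: str) -> CodeRefParser:
    """Parse a page and return the parser holding the collected references."""
    parser = CodeRefParser()
    parser.feed(page)
    parser.close()
    return parser
```

A browser built along these lines would download and run the auto_run entries immediately, defer the on_select entries until the user selects the link, and fetch each listed handler before processing its flat file.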

One aspect of the preferred embodiment that would
function differently in the WWW environment is server
registration, which is the process whereby servers make
their presence known to the client. In the WWW environment,
server and client connections are not direct.
Consequently, new Web servers' name servers cannot
simply register their names and addresses with the Web
clients' name servers. However, another simple registration
process can be implemented wherein the Web
servers automatically register with an appropriate directory,
such as Yahoo, upon going online or changing their
Internet address. The client's name server can then periodically
register new servers from that directory.
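
The directory-based registration can be sketched as a registry that servers write to and that a client-side name server polls. The Python sketch below is illustrative only; the Directory and ClientNameServer classes, their methods and the in-memory dictionaries are assumptions made for this example and do not describe any real directory service.

```python
from typing import Dict


class Directory:
    """Hypothetical directory with which servers register themselves."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def register(self, name: str, address: str) -> None:
        # Called by a server on going online or changing its address.
        self.entries[name] = address


class ClientNameServer:
    """Hypothetical client-side name server that polls the directory."""

    def __init__(self) -> None:
        self.known: Dict[str, str] = {}

    def sync(self, directory: Directory) -> Dict[str, str]:
        # Pick up servers that are new or whose address has changed.
        changed = {name: addr for name, addr in directory.entries.items()
                   if self.known.get(name) != addr}
        self.known.update(changed)
        return changed
```

Calling sync periodically returns only the entries added or changed since the previous call, so an unchanged directory yields an empty result.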

In summary, the alternative embodiment of Figure
6 is a Web browser and a set of compatible HTML-like
compound documents that implement the method and
system of the present invention. Specifically, the alternative
Web browser 322 can process HTML-like documents
in which executable code fragments are explicitly
referenced or implied in an attribute field associated with
a data file. In the alternative embodiment, few modifications
are required to existing Web servers 310, the only
real change on the Web server side being the form of
the HTML documents that are stored on the various
servers. That is, in the alternative embodiment, new
fields are added to the HTML documents to embed executable
code fragments and to provide any additional
file attributes that might be needed for appropriate handlers
to be downloaded by the Web browsers 322, thereby
allowing Web browsers to display/process any type
of flat file without any prior knowledge of file formats or
local access to an appropriate handler.



Claims 

1. In a network that includes at least one server and
at least one client coupled to said network, said
servers and said clients each including a computer
controller and a memory and having a unique network
ID, said memory for said client comprising:

a compound document containing at least
one reference to an embedded application, said reference
being selected from an explicit reference or
an implicit reference; such that, upon receiving said
compound document from said server over said
network, said client is configured to download and
execute on said client computer controller said referenced
embedded application.

2. The memory of claim 1, said compound document
including a reference to a flat file, wherein one of
said explicit references to an embedded application
is a reference to a handler for said flat file; such that,
upon receiving said compound document from said
server over said network, said client is configured
to download said flat file and said handler and then
process said flat file using said handler, said client
being unable to process said flat file without said
handler being resident on said client.

3. The memory of claim 1, said compound document
including a reference to a flat file of a determinable
type, wherein one of said implicit references is a reference
to a handler for said flat file that is implied
by said determinable type; such that, upon receiving
said compound document from said server over
said network, said client is configured to download
said flat file and, based on said determinable type,
download from one of said servers said handler and
then process said flat file using said downloaded
handler, said client being unable to process said flat
file without said handler being resident on said client.

4. The memory of claim 1, said compound document
having references to a plurality of embedded
documents including said embedded application,
said memory including a requestor module configured
to download into said client those of said plurality
of embedded documents not stored in said client's
memory.

5. A system for transmitting embedded documents
over a network that includes at least one server and 
at least one client coupled to said network, said 
servers and said clients each including a computer 
controller and a memory and having a unique net- 
work ID, said system comprising: 

a requestor that is executable on a client's com- 
puter controller, said requestor being config- 
ured to control messages issued by said client 
on said network, said messages including a 
document request message instructing a par- 
ticular one of said servers to transmit to said 
client a particular compound document, where- 
in a compound document includes references 
to a plurality of said embedded documents, said 
embedded documents having types selected 
from executable code fragments, flat docu- 
ments and other compound documents; and 
a provider that is executable on a server's com- 
puter controller, said provider being configured 
to be responsive to said messages directed to 
said server, said provider being configured to 
respond to said document request message by 
causing said server to transmit to said client 
said particular compound document; 
such that, upon receiving said particular com- 
pound document, said requestor is configured 
to retrieve using document request messages 
at least a subset of said embedded documents 
referenced by said particular compound document
that are not stored in said client's memory
and then form an assembled compound docu- 
ment including said flat documents and said ex- 
ecutable code fragments. 

6. The system of claim 5, wherein said client further 
comprises: 

a multimodal output device controlled by said 
requestor; 

said requestor being configured to output to 
said output device an output compound docu- 
ment, wherein said output compound docu- 
ment includes: 

text images corresponding to said flat doc- 
uments included in said assembled com- 
pound document; and 
manifestations of said executable code 
fragments, said manifestations resulting 
from execution of said code fragments by 
said requestor. 

7. The system of claim 5, wherein: 

said requestor is a multi-tasking operating sys- 
tem; and said client further comprises: 
a multimodal output device controlled by said 
requestor; 

said requestor being configured to output to 
said output device an output compound docu- 
ment, wherein said output compound docu- 
ment includes: 

text images corresponding to said flat doc- 
uments included in said assembled com- 
pound document; and 
manifestations of said executable code 
fragments, said manifestations resulting 
from execution of said code fragments by 
said requestor; 

such that said requestor is configured to con- 
currently execute a plurality of said code frag- 
ments in real time while outputting said text im- 
ages. 

8. The system of claim 5, wherein a retrieved execut- 
able code fragment encapsulates all data and rou- 
tines necessary for its execution and is encoded so 
that it is executable by said client. 

9. The system of claim 6, wherein at least a subset of
said executable code fragments is selected from
the group consisting of:

executable sound clips;
executable simulations;
executable movie clips;
executable ticker tape-style messages; and
handlers for said flat documents;
said client being configured to execute said executable
code fragments.

10. The system of claim 5, further comprising a name
server coupled to said client, said name server
maintaining a list of at least a subset of said servers
on said network that are responsive to messages
from said client, said name server being configured
to return information from said list to said client
whenever requested by said client.

11. The system of claim 5, wherein said servers are
Web servers, said clients are Web clients and said
network is the Internet.

12. A method for transmitting embedded documents
over a network that includes at least one server and
at least one client coupled to said network, said
servers and said clients each including a computer
controller and a memory and having a unique network
ID, said client also including a user input device,
said method comprising the steps of:

(1) said client issuing a document request message
on said network to a particular server, said
document request message designating a particular
compound document to be returned to
said client by said particular server;

(2) in response to said document request message,
said particular server returning said particular
compound document to said client,
wherein a compound document includes references
to a plurality of embedded documents,
each of said embedded documents having a
type selected from executable code fragments,
flat documents and other compound documents,

(3) upon receiving said compound document
from said server, said client issuing a plurality
of document_request messages to retrieve via
said network any flat documents and executable
code fragments referenced by said compound
document that are not stored in said client's
memory; and

(4) forming an assembled compound document
including said flat documents and said executable
code fragments.

13. The method of claim 12, wherein said method further
comprises the step of:

said client outputting to a multimodal output
device said assembled output compound document,
said outputting step including:

displaying on said multimodal device text images
corresponding to said flat documents included
in said assembled compound document;
and

executing said executable code fragments and
outputting to said multimodal device manifestations
of execution of said code fragments.

14. The method of claim 13, wherein said step of displaying
said flat documents is performed by at least
a subset of said executing code fragments, each of
said subset being a data handler compatible with at
least one of said flat documents.

15. The method of claim 12, wherein said method further
comprises the step of:

said client outputting to a multimodal output device
said assembled output compound document,
said outputting step including:

displaying on said multimodal device text
images corresponding to said flat documents
included in said assembled compound
document; and

executing said executable code fragments
and outputting to said multimodal device
manifestations of execution of said code
fragments;

such that said executing step is performed in
real time and concurrently with said displaying
steps.

16. The method of claim 15, wherein said step of displaying
said flat documents is performed by at least
a subset of said executing code fragments, each of
said subset being a data handler compatible with at
least one of said flat documents.